AITopics

Country: Asia > China (0.04)

Genre: Research Report (1.00)

Industry:

Transportation > Ground > Road (0.67)
Information Technology (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Neural Information Processing SystemsDec-24-2025, 19:16:46 GMT

Contrastive Language-Image Pre-Training with Knowledge Graphs

Recent years have witnessed the fast development of large-scale pre-training frameworks that can extract multi-modal representations in a unified form and achieve promising performances when transferred to downstream tasks. Nevertheless, existing approaches mainly focus on pre-training with simple image-text pairs, while neglecting the semantic connections between concepts from different modalities. In this paper, we propose a knowledge-based pre-training framework, dubbed Knowledge-CLIP, which injects semantic information into the widely used CLIP model. Through introducing knowledge-based objectives in the pre-training process and utilizing different types of knowledge graphs as training data, our model can semantically align the representations in vision and language with higher quality, and enhance the reasoning ability across scenarios and modalities. Extensive experiments on various vision-language downstream tasks demonstrate the effectiveness of Knowledge-CLIP compared with the original CLIP and competitive baselines.

contrastive language-image pre-training, knowledge graph, name change, (7 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial IntelligenceOct-20-2025

Learning More with Less: A Generalizable, Self-Supervised Framework for Privacy-Preserving Capacity Estimation with EV Charging Data

Arunan, Anushiya, Qin, Yan, Li, Xiaoli, Tan, U-Xuan, Poor, H. Vincent, Yuen, Chau

This manuscript has been accepted in IEEE Transactions on Industrial Informatics. Personal use of this material is permitted. Abstract--Accurate battery capacity estimation is key to alleviating consumer concerns about battery performance and reliability of electric vehicles (EVs). However, practical data limitations imposed by stringent privacy regulations and labeled data shortages hamper the development of generalizable capacity estimation models that remain robust to real-world data distribution shifts. While self-supervised learning can leverage unlabeled data, existing techniques are not particularly designed to learn effectively from challenging field data--let alone from privacy-friendly data, which are often less feature-rich and noisier . In this work, we propose a first-of-its-kind capacity estimation model based on self-supervised pre-training, developed on a large-scale dataset of privacy-friendly charging data snippets from real-world EV operations. Our pre-training framework, snippet similarity-weighted masked input reconstruction, is designed to learn rich, generalizable representations even from less feature-rich and fragmented privacy-friendly data. Our key innovation lies in harnessing contrastive learning to first capture high-level similarities among fragmented snippets that otherwise lack meaningful context. With our snippet-wise contrastive learning and subsequent similarity-weighted masked reconstruction, we are able to learn rich representations of both granular charging patterns within individual snippets and high-level associative relationships across different snippets. Bolstered by this rich representation learning, our model consistently outperforms state-of-the-art baselines, achieving 31.9%

artificial intelligence, inductive learning, machine learning, (17 more...)

doi: 10.1109/TII.2025.3613385

2510.05172

Genre: Research Report (0.40)

Industry:

Transportation > Ground > Road (1.00)
Transportation > Electric Vehicle (1.00)
Information Technology > Security & Privacy (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.67)

Neural Information Processing SystemsOct-9-2025, 01:23:27 GMT

903f778fe1341e5351b5b63e0e6b197f-Paper-Conference.pdf

artificial intelligence, machine learning, natural language, (16 more...)

Country: Asia > China (0.04)

Genre: Research Report (0.93)

Industry: Transportation > Ground > Road (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.67)

arXiv.org Artificial IntelligenceJun-23-2025

CRIA: A Cross-View Interaction and Instance-Adapted Pre-training Framework for Generalizable EEG Representations

Liu, Puchun, Chen, C. L. Philip, He, Yubin, Zhang, Tong

The difficulty of extracting deep features from EEG data and effectively integrating information from multiple views presents significant challenges for developing a generalizable pretraining framework for EEG representation learning. However, most existing pre-training methods rely solely on the contextual semantics of a single view, failing to capture the complex and synergistic interactions among different perspectives, limiting the expressiveness and generalization of learned representations. To address these issues, this paper proposes CRIA, an adaptive framework that utilizes variable-length and variable-channel coding to achieve a unified representation of EEG data across different datasets. In this work, we define cross-view information as the integrated representation that emerges from the interaction among temporal, spectral, and spatial views of EEG signals. The model employs a cross-attention mechanism to fuse temporal, spectral, and spatial features effectively, and combines an attention matrix masking strategy based on the information bottleneck principle with a novel viewpoint masking pre-training scheme. Experimental results on the Temple University EEG corpus and the CHB-MIT dataset show that CRIA outperforms existing methods with the same pre-training conditions, achieving a balanced accuracy of 57.02% for multi-class event classification and 80.03% for anomaly detection, highlighting its strong generalization ability.

data mining, eeg data, machine learning, (19 more...)

2506.16056

Country: Asia > China > Guangdong Province (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Neural Information Processing SystemsJan-17-2025, 20:05:35 GMT

Contrastive Language-Image Pre-Training with Knowledge Graphs

contrastive language-image pre-training, knowledge graph, pre-training framework, (4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.66)

arXiv.org Artificial IntelligenceSep-25-2024

Unleashing the Power of Emojis in Texts via Self-supervised Graph Pre-Training

Zhang, Zhou, Tan, Dongzeng, Wang, Jiaan, Chen, Yilong, Xu, Jiarong

Emojis have gained immense popularity on social platforms, serving as a common means to supplement or replace text. However, existing data mining approaches generally either completely ignore or simply treat emojis as ordinary Unicode characters, which may limit the model's ability to grasp the rich semantic information in emojis and the interaction between emojis and texts. Thus, it is necessary to release the emoji's power in social media data mining. To this end, we first construct a heterogeneous graph consisting of three types of nodes, i.e. post, word and emoji nodes to improve the representation of different elements in posts. The edges are also well-defined to model how these three elements interact with each other. To facilitate the sharing of information among post, word and emoji nodes, we propose a graph pre-train framework for text and emoji co-modeling, which contains two graph pre-training tasks: node-level graph contrastive learning and edge-level link reconstruction learning. Extensive experiments on the Xiaohongshu and Twitter datasets with two types of downstream tasks demonstrate that our approach proves significant improvement over previous strong baseline methods.

emoji, node, representation, (12 more...)

2409.14552

Country:

Asia > Philippines > Luzon > National Capital Region > City of Manila (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.48)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)
(2 more...)

arXiv.org Artificial IntelligenceFeb-4-2024

TimeSiam: A Pre-Training Framework for Siamese Time-Series Modeling

Dong, Jiaxiang, Wu, Haixu, Wang, Yuxuan, Qiu, Yunzhong, Zhang, Li, Wang, Jianmin, Long, Mingsheng

Time series pre-training has recently garnered wide attention for its potential to reduce labeling expenses and benefit various downstream tasks. Prior methods are mainly based on pre-training techniques well-acknowledged in vision or language, such as masked modeling and contrastive learning. However, randomly masking time series or calculating series-wise similarity will distort or neglect inherent temporal correlations crucial in time series data. To emphasize temporal correlation modeling, this paper proposes TimeSiam as a simple but effective self-supervised pre-training framework for Time series based on Siamese networks. Concretely, TimeSiam pre-trains Siamese encoders to capture intrinsic temporal correlations between randomly sampled past and current subseries. With a simple data augmentation method (e.g.~masking), TimeSiam can benefit from diverse augmented subseries and learn internal time-dependent representations through a past-to-current reconstruction. Moreover, learnable lineage embeddings are also introduced to distinguish temporal distance between sampled series and further foster the learning of diverse temporal correlations. TimeSiam consistently outperforms extensive advanced pre-training baselines, demonstrating superior forecasting and classification capabilities across 13 standard benchmarks in both intra- and cross-domain scenarios.

representation, time sery, timesiam, (15 more...)

2402.02475

Country:

Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)

Genre: Research Report (0.81)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.92)
Health & Medicine > Therapeutic Area > Neurology (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

arXiv.org Artificial IntelligenceDec-24-2023

Synergistic Anchored Contrastive Pre-training for Few-Shot Relation Extraction

Luo, Da, Gan, Yanglei, Hou, Rui, Lin, Run, Liu, Qiao, Cai, Yuxiang, Gao, Wannian

Few-shot Relation Extraction (FSRE) aims to extract relational facts from a sparse set of labeled corpora. Recent studies have shown promising results in FSRE by employing Pre-trained Language Models (PLMs) within the framework of supervised contrastive learning, which considers both instances and label facts. However, how to effectively harness massive instance-label pairs to encompass the learned representation with semantic richness in this learning paradigm is not fully explored. To address this gap, we introduce a novel synergistic anchored contrastive pre-training framework. This framework is motivated by the insight that the diverse viewpoints conveyed through instance-label pairs capture incomplete yet complementary intrinsic textual semantics. Specifically, our framework involves a symmetrical contrastive objective that encompasses both sentence-anchored and label-anchored contrastive losses. By combining these two losses, the model establishes a robust and uniform representation space. This space effectively captures the reciprocal alignment of feature distributions among instances and relational facts, simultaneously enhancing the maximization of mutual information across diverse perspectives within the same relation. Experimental results demonstrate that our framework achieves significant performance enhancements compared to baseline models in downstream FSRE tasks. Furthermore, our approach exhibits superior adaptability to handle the challenges of domain shift and zero-shot relation extraction. Our code is available online at https://github.com/AONE-NLP/FSRE-SaCon.

learning, proceedings, representation, (12 more...)

2312.12021

Country:

Asia > China (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Abbaspourazad, Salar, Elachqar, Oussama, Miller, Andrew C., Emrani, Saba, Nallasamy, Udhyakumar, Shapiro, Ian

Large-scale Training of Foundation Models for Wearable Biosignals

arXiv.org Artificial IntelligenceDec-8-2023

Tracking biosignals is crucial for monitoring wellness and preempting the development of severe medical conditions. Today, wearable devices can conveniently record various biosignals, creating the opportunity to monitor health status without disruption to one's daily routine. Despite widespread use of wearable devices and existing digital biomarkers, the absence of curated data with annotated medical labels hinders the development of new biomarkers to measure common health conditions. In fact, medical datasets are usually small in comparison to other domains, which is an obstacle for developing neural network models for biosignals. To address this challenge, we have employed self-supervised learning using the unlabeled sensor data collected under informed consent from the large longitudinal Apple Heart and Movement Study (AHMS) to train foundation models for two common biosignals: photoplethysmography (PPG) and electrocardiogram (ECG) recorded on Apple Watch. We curated PPG and ECG datasets from AHMS that include data from ~141K participants spanning ~3 years. Our self-supervised learning framework includes participant level positive pair selection, stochastic augmentation module and a regularized contrastive loss optimized with momentum training, and generalizes well to both PPG and ECG modalities. We show that the pre-trained foundation models readily encode information regarding participants' demographics and health conditions. To the best of our knowledge, this is the first study that builds foundation models using large-scale PPG and ECG data collected via wearable consumer devices $\unicode{x2013}$ prior works have commonly used smaller-size datasets collected in clinical and experimental settings. We believe PPG and ECG foundation models can enhance future wearable devices by reducing the reliance on labeled data and hold the potential to help the users improve their health.

classification 0, participant, ppg and ecg, (13 more...)

2312.05409

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts (0.04)
North America > Greenland (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report > New Finding (0.93)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Consumer Health (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)